Text Treatment and Morphology in the Analysis of Danish within EUROTRA

Author

  • Niels Jaeger
Abstract

The EUROTRA translation system consists basically of a C-programme which operates a front-end programme package and a core PROLOG programme, the Engine. The Engine combines with a series of generators and translators written by the linguist in a user language and compiled into PROLOG code. The generators and the translators contain linguistic information according to the linguistic specifications in the EUROTRA model. The EUROTRA model consists of three main parts: analysis, transfer and synthesis. The main parts are subdivided into levels. Each level in the EUROTRA model has a grammar (i.e. a generator) and between two levels there is a translator. The EMS, EUROTRA Morphological Structure, is the first level in analysis. The first syntactic level follows immediately after EMS; it is called ECS, EUROTRA Constituent Structure.

A front-end programme package is being tested in EUROTRA-Denmark, from July to December 1989. It consists of a wordscanner written in C and an SGML parser. SGML is an abbreviation of "Standard Generalized Markup Language". In the present experimental phase the document which is going to be translated has to be written in "VI" or "Qone", two editors available for UNIX installations. A "VI" document may be formatted by "Nroff". But in principle, any text editor could be used for SGML text entry, just as SGML could deliver an output to any text editor. SGML provides methods for marking up documents. For instance you can mark up the logical structure of a text (i.e. chapters, sections, paragraphs etc.). SGML also provides a search-and-replace mechanism. We can make the SGML parser search for layout information and replace it by a marker.
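As a rough illustration of the search-and-replace idea in the abstract, the following Python sketch replaces layout information at the start of a line with a marker. It is not the EUROTRA front end or an SGML parser. The function name `mark_up_layout`, the two nroff-style requests and the marker names are all assumptions chosen for the example; none of them come from the paper.

```python
# Hypothetical mapping from nroff-style layout requests to markers.
# Both the request set and the marker names are assumptions for illustration.
LAYOUT_MARKERS = {
    ".PP": "<p>",   # paragraph break
    ".SH": "<h>",   # section heading
}


def mark_up_layout(text: str) -> str:
    """Replace a known layout request at the start of each line by its marker."""
    out_lines = []
    for line in text.splitlines():
        # Split off the first whitespace-delimited token; it may be a request.
        request, _, rest = line.partition(" ")
        if request in LAYOUT_MARKERS:
            # Keep any text following the request, e.g. a heading title.
            out_lines.append((LAYOUT_MARKERS[request] + " " + rest).strip())
        else:
            # Ordinary text lines pass through unchanged.
            out_lines.append(line)
    return "\n".join(out_lines)


if __name__ == "__main__":
    document = ".SH Indledning\n.PP\nDette er en tekst."
    print(mark_up_layout(document))
```

The lookup table keeps the layout conventions of a particular editor separate from the replacement logic, which matches the abstract's remark that, in principle, any text editor could be used for text entry.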


Similar resources

Parsing Danish Text in EUROTRA

The machine translation project Eurotra is described as a multi language modular translation system with 9 monolingual analysis modules, 72 bilingual transfer modules, and 9 monolingual synthesis modules. The analysis module for Danish is described as a 3 step parser with structure generation rules for immediate constituent structure, syntactic structure, and semantic structure, and translation...


The analysis of tense and aspect in EUROTRA

This paper presents a framework for the model-theoretic analysis of tense and aspect forms in discourse. It has been developed for Eurotra, the MT project of the European Community, and has been applied to the nine Eurotra languages: English, German, Dutch, Danish, Greek, Italian, French, Spanish and Portuguese. The paper consists of six parts. The first presents the problem of translating ten...


Large Lexical European Projects and the Multilingual Aspect

We aim at providing an overview and a comparison of multilingual and Machine Translation (MT) related issues that emerge and that are handled within the major European projects in the lexical area (ET-7, Acquilex I and II, Multilex, Genelex, ET-10 Semantic Analysis of Cobuild, ET-10 Collocations, ET-10 Statistical Text-corpora based complements for Eurotra, Delis, Eurolang and Eagles), most of ...


Various Representations Of Text Proposed For Eurotra

We introduce several general notions concerning texts and the particularities of text processing on a computer support, in relation to some problems which are specific to M(A)T. We then present the solution we have proposed for the duration of the EUROTRA project.


The Integration of a Part-of-Speech Tagger into

We describe how part-of-speech information delivered by a tagger (the MPRO tool) has been integrated into the ALEP (Advanced Language Engineering Platform) system. For this we extended an approach described within the LS-GRAM project, which consisted in defining the Text Handling component of ALEP in such a way that so-called "messy details" are handled within this subsystem, hence keeping the (...




Publication date: 1989